perm filename 2[00,BGB] blob sn#047821 filedate 1973-06-07 generic text, type C, neo UTF8
COMMENT ⊗   VALID 00011 PAGES
C REC  PAGE   DESCRIPTION
C00001 00001
C00002 00002	~F8THE ALGORITHM: INTRODUCTION.
C00005 00003	~F81. THRESHOLDING.
C00009 00004	~F82. CONTOURING.
C00016 00005	~F83. NESTING.
C00022 00006	~F83. NESTING.
C00025 00007	"	The image tree  is generated one  threshold level at  a time,
C00030 00008	~F84. SMOOTHING.
C00036 00009	~F85. COMPARING.
C00040 00010	~F85. COMPARING.
C00044 00011	~F8SUMMARY OF LAMINA INERTIA TENSOR EXPRESSIONS.
C00047 ENDMK
C⊗;
~F8THE ALGORITHM: INTRODUCTION.

	CRE consists of five steps: thresholding, contouring,
nesting, smoothing and comparing. Thresholding, contouring and
smoothing perform conversions between two different kinds of images.
Nesting and comparing compute topological relationships within and
between given image representations. In summary the major operations are:

	MAJOR OPERATION.	OPERAND.	  RESULT.

	1. THRESHOLDING: 	6-BIT-IMAGE,	  1-BIT-IMAGES.
	2. CONTOURING: 	 	1-BIT-IMAGES,	  VIC-IMAGE.
	3. NESTING:		VIC-IMAGE,	  NESTED-VIC-IMAGE.
	4. SMOOTHING:	 	VIC-IMAGE,	  ARC-IMAGE.
	5. COMPARING:		IMAGE & FILM,	  FILM.

	Although the natural order of operations is sequential from
image thresholding to image comparing, the first four steps are
applied one intensity level at a time, from the darkest cut to the
lightest cut, in order to keep memory size down (only nesting depends
on this sequential cut order); comparing is applied to whole images.

	The illustrations on pages 21 and 23 show an initial video
image and its final smoothed contour image; the illustrations
immediately below and on page 24 show the corresponding intermediate
sawtoothed contour images. The illustrated images are each composed
of seven intensity levels, and took 16 seconds and 13 seconds
respectively to compute (on a PDP-10 with 2usec memory). The final
CRE data structures contained 680 and 293 nodes respectively, which
comes to 2K and 4.5K words; the initial video image requires
10.2K words.

FIGURE: PUMP SAW TOOTHED CONTOURS.
~I1973,800;F8- 22 -
~F81. THRESHOLDING.

	Thresholding, the first and easiest step, consists of two
subroutines called THRESH and PACXOR. THRESH converts a 6-bit
image into a 1-bit image with respect to a given threshold cut level
between zero for black and sixty-three for light. All pixels equal
to or greater than the cut map into one; all pixels less than
the cut map into zero. The resulting 1-bit image is stored in a bit
array of 216 rows by 288 columns (1728 words) called the PAC
(picture accumulator), which was named in memory of McCormick's
ILLIAC-III. After THRESH, the PAC contains blobs of bits. A blob
is defined as "rook's move" simply connected; that is, every bit of a
blob can be reached by horizontal or vertical moves from any other
bit without having to cross a zero bit or having to make a diagonal
(bishop's) move. Blobs may of course have holes. Or equivalently,
a blob always has one outer perimeter polygon, and may have one,
several or no inner perimeter polygons. This blob and hole topology
is recoverable from the CRE data structure and is built by the
nesting step.
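The cut rule can be sketched in a few lines (an illustrative modern
fragment, not the packed 1728-word bit array used on the PDP-10; the
name thresh merely echoes the routine name):

```python
# A minimal sketch of THRESH, assuming a 6-bit image held as a list of
# rows of pixel values in 0..63.  Pixels greater than or equal to the
# cut map into one; all others map into zero.

def thresh(image, cut):
    """Convert a 6-bit image to a 1-bit image at the given cut level."""
    return [[1 if pixel >= cut else 0 for pixel in row] for row in image]

# Example: a 3-by-4 image thresholded at cut level 32.
pac = thresh([[0, 10, 40, 63],
              [5, 32, 31, 50],
              [0,  0, 60,  2]], 32)
# pac == [[0, 0, 1, 1], [0, 1, 0, 1], [0, 0, 1, 0]]
```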

	Next,   PACXOR copies the  PAC into  two slightly larger  bit
arrays named HSEG and  VSEG. Then the PAC is shifted down one row and
exclusive OR'ed into  the HSEG array;  and the  PAC is shifted  right
one column  and exclusive OR'ed  into the  VSEG array to  compute the
horizontal and  vertical border bits of the PAC blobs.  Notice that
this is the very heart  of the "edge finder" of CRE. Namely,   PACXOR
is the mechanism that converts regions into edges.
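The shift-and-XOR border extraction can be sketched as follows, under
the assumption of plain (unpacked) bit arrays; HSEG gets one extra row
and VSEG one extra column so the shifted copies fit:

```python
def pacxor(pac):
    """Sketch of PACXOR: XOR the PAC with a copy of itself shifted down
    one row (into HSEG) and with a copy shifted right one column (into
    VSEG), marking the horizontal and vertical border bits of blobs."""
    rows, cols = len(pac), len(pac[0])
    hseg = [[0] * cols for _ in range(rows + 1)]       # one extra row
    vseg = [[0] * (cols + 1) for _ in range(rows)]     # one extra column
    for r in range(rows):
        for c in range(cols):
            hseg[r][c]     ^= pac[r][c]   # PAC in place ...
            hseg[r + 1][c] ^= pac[r][c]   # ... XOR PAC shifted down one row
            vseg[r][c]     ^= pac[r][c]   # PAC in place ...
            vseg[r][c + 1] ^= pac[r][c]   # ... XOR PAC shifted right one column
    return hseg, vseg
```

A lone blob bit yields exactly two horizontal and two vertical border
bits, its four surrounding cracks.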
~I1973,800;F8- 24 -
~F82. CONTOURING.

	Contouring converts the bit arrays HSEG and VSEG into
vectors and polygons. The contouring itself is done by a single
subroutine named MKPGON (make polygon). When MKPGON is called, it
looks for the uppermost left non-zero bit in the VSEG array. If the
VSEG array is empty, MKPGON returns NIL. However, when the bit is
found, MKPGON traces and erases the polygonal outline to which that
bit belongs and returns a polygon node with a ring of vectors.

	To belabor the details (for the sake of later complexities):
the MKPGON trace can go in four directions: north and south along
vertical columns of bits in the VSEG array, or east and west along
horizontal rows of the HSEG array. The trace starts by heading south
until it hits a turn; when heading south MKPGON must check first for
a turn to the east (indicated by a bit in HSEG), next for no turn
(continue south), and last for a turn to the west. When a turn is
encountered MKPGON creates a vector node representing the run of
bits between the previous turn and the present turn. The trace
always ends heading west bound. The outline so traced can be either
the edge of a blob or a hole; the two cases are distinguished by
looking at the VIC-polygon's uppermost left pixel in the PAC bit
array.

	There are two complexities: contrast accumulation and
dekinking. The contrast of a vector is defined as (QUOTIENT
(DIFFERENCE (Sum of pixel values on one side of the vector)(Sum of
pixel values on the other side of the vector)) (length of the vector
in pixels)). Since vectors are always either horizontal or vertical
and are construed as being on the cracks between pixels, the
specified summations refer to the pixels immediately to either side
of the vector. Notice that this definition of contrast will always
give a positive contrast for the vectors of a blob and a negative
contrast for the vectors of a hole.
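The contrast definition restates as a small function; the crack
coordinate convention used here (a horizontal vector lies on the crack
between rows row-1 and row) is an illustrative assumption:

```python
def vector_contrast(image, row, col, length, horizontal=True):
    """Contrast across a crack-edge vector, per the definition above:
    (sum of pixels on one side - sum on the other) / length in pixels.
    Sketch only: the vector is assumed to lie fully inside the image,
    and which side is subtracted from which is a chosen convention."""
    if horizontal:
        # crack between image rows row-1 (above) and row (below)
        above = sum(image[row - 1][col + i] for i in range(length))
        below = sum(image[row][col + i] for i in range(length))
        return (below - above) / length
    else:
        # crack between image columns col-1 (left) and col (right)
        left = sum(image[row + i][col - 1] for i in range(length))
        right = sum(image[row + i][col] for i in range(length))
        return (right - left) / length
```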

	The terms "jaggies", "kinks" and "sawtooth" are all used to
express what seems to be wrong about the lowermost left polygon on
page 25. The problem involves doing something to a rectilinear
quantized set of segments to make its linear nature more evident.
The CRE jaggies solution (in the subroutine MKPGON) merely positions
the turning locus diagonally off its grid point a little in the
direction (northeast, northwest, southwest or southeast) that
bisects the turn's right angle. The distance of dekink vernier
positioning is always less than half a pixel, but greater for
brighter cuts and less for the darker cuts, in order to preserve the
nesting of contours. The sawtoothed and the dekinked versions of a
polygon have the same number of vectors. I am very fond of this
dekinking algorithm because of its incredible efficiency; given that
you have a north, south, east, west polygon trace routine (which
handles image coordinates packed row, column into one accumulator
word), dekinking requires only one more ADD instruction
execution per vector ! ~I1973,800;F8- 26 -
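The vernier positioning can be sketched as follows. The direction
table and the bisector rule (reversed incoming direction plus outgoing
direction) are modeled on the description above, with rows increasing
southward; the exact offset arithmetic is an assumption of this sketch,
not the packed-word ADD trick itself:

```python
# Unit direction vectors as (d_row, d_col), rows increasing southward.
DIR = {'N': (-1, 0), 'S': (1, 0), 'E': (0, 1), 'W': (0, -1)}

def dekink(vertex, dir_in, dir_out, d=0.25):
    """Offset a turn vertex by d (< 0.5 pixel) along the diagonal that
    bisects the turn's right angle.  The bisector of the turn is taken
    as the reversed incoming direction plus the outgoing direction,
    which always yields one of NE, NW, SW, SE."""
    r, c = vertex
    ri, ci = DIR[dir_in]
    ro, co = DIR[dir_out]
    br, bc = ro - ri, co - ci          # both components are +1 or -1
    return (r + d * br, c + d * bc)
```

For example, a south-heading trace turning east nudges the corner
toward the northeast.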
~F83. NESTING.


	The nesting problem is to decide whether one contour polygon
is within another. Although easy in the two polygon case, solving
the nesting of many polygons with respect to each other becomes
n-squared expensive in either compute time or in memory space. The
nesting solution in CRE sacrifices memory for the sake of greater
speed and requires a 31K array, called the SKY.


	CRE's accumulation  of  a properly  nested  tree of  polygons
depends on the  order of threshold cutting going  from dark to light.
For each polygon there are two  nesting steps: first, the polygon  is
placed in  the  tree of  nested polygons  by  the subroutine  INTREE;
second,   the polygon  is placed in  the SKY array  by the subroutine
named INSKY.


	The SKY array is 216 rows of 289 columns of 18-bit pointers.
The name "SKY" came about because the array used to represent the
furthest away regions or background, which in the case of a robot
vehicle is the real blue sky. The sky contains vector pointers, and
would be more efficient on a virtual memory machine that didn't
allocate unused pages of its address space. Whereas most computers
have more memory containers than address space, computer graphics
and vision might be easier to program in a memory with more address
space than physical space; i.e. an almost empty virtual memory.


	The first part of the INTREE routine  finds the surrounder of
a given polygon  by scanning the SKY due east from the uppermost left
pixel of  the given polygon.   The  SON of  a polygon  is always  its
uppermost  left  vector. After  INTREE,    the INSKY  routine  places
pointers  to the vertical vectors  of the given  polygon into the sky
array.
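The due-east SKY scan might be sketched as follows; treating the SKY
as an array of polygon-owner marks rather than 18-bit vector pointers,
and returning the first foreign mark encountered, are simplifying
assumptions of this illustration:

```python
def find_surrounder(sky, row, col, owner):
    """Sketch of the first part of INTREE: scan the SKY due east from
    the uppermost left pixel of the polygon `owner` until a mark
    belonging to some other polygon is found; that polygon is taken as
    the surrounder.  Running off the image means the polygon is
    surrounded only by the background (the sky itself)."""
    for c in range(col, len(sky[row])):
        mark = sky[row][c]
        if mark is not None and mark != owner:
            return mark
    return None   # no surrounder: enclosed by the background
```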


	The second part of the INTREE routine checks for and fixes
up the case where the new polygon captures a polygon that is already
enclaved. This only happens when two or more levels of the image
have blobs that have holes. The next paragraph explains the arcane
details of fixing up the tree links of multi-level hole polygons and
the box following that is a quotation from the appendix of Krakauer's
thesis [3] describing his nesting algorithm.
~F83. NESTING.

	Let the given polygon  be named Poly; and let  the surrounder
of Poly be  called Exopoly; and assume that Exopoly surrounds several
enclaved polygons called  "endo's", which are  already in the  nested
polygon tree. Also, there are two  kinds of temporary lists named the
PLIST and  the NLIST. There is one PLIST which is initially a list of
all the ENDO polygons on  Exopoly's ENDO ring. Each endo in  turn has
an NLIST  which is initially  empty.  The  subroutine INTREE re-scans
the sky array for the polygon  due east of the uppermost left  vector
of each  endo polygon on  the PLIST, (Exopoly's  ENDO ring).  On such
re-scanning, (on behalf of say an Endo1), there are four cases:

	1. No change; the scan returns Exopoly,
	   which is Endo1's original EXO.
	2. Poly captures Endo1; the scan returns Poly, indicating
	   that Endo1 has been captured by Poly.
	3. My brother's fate; the scan hits an Endo2 which
	   is not on the PLIST, which means that Endo2's EXO is valid
	   and is the valid EXO of Endo1.
	4. My fate delayed; the scan hits an Endo2 which is still
	   on the PLIST, which means that Endo2's EXO is not yet
	   valid, but when discovered it will also be Endo1's EXO;
	   so Endo1 is CONS'ed into Endo2's NLIST.

When an endo  polygon's  EXO has  been  re-discovered, then  all  the
polygons on  that endo's NLIST are also  placed into the polygon tree
at that place. All of this link crunching machinery takes half a page
of code and is not frequently executed.
"	The image tree  is generated one  threshold level at  a time,
starting  at the  highest  level (branch  tips). At  each  level, the
image is scanned, and the points above the threshold are marked  in a
scratch array. This  scratch array is then scanned  for marked points.
When  one is found, a contiguity routine  is called, which visits all
marked points which  can be  reached from the  start via a  connected
path.    The marks  are  erased by  this  routine as  it  goes,   and
statistics are kept on the region  thus generated,  such as the  sums
of the  x  and y  coordinates of  the points,    and the  sum of  the
squares  of the x  and y coordinates  (used to  compute the center and
the eccentricity). A tree  node is then made  up for the region,  and
the scan for marked points continues.   A special mark is left in the
scratch  array for each region. When  this mark is encountered during
the scan at the next  level, it is looked up on an  association list.
This establishes the link between  a region and the regions which are
a subset of it at the previous level  - i.e.  between a node and  its
sub-nodes.	"

"	The contiguity scan is the most complex  program. It works by
leaving  directional   pointers  in  the  scratch  array.  These  are
three-bit  codes denoting  one  of  the  eight  possible  neighboring
points. The contiguity scan is always  started at a point which is on
the  bottom edge  of the  region. It  traces along  this edge  to the
right by  moving  from  one marked  point  to  the next,  but  always
keeping an un-marked  point to the right side. As  it goes, it erases
the marks,  so that  for a  region with  smooth boundaries,  it  will
follow a  spiral path  to the  center, "eating  up" the  marks as  it
goes,  like a  lathe  with the  tool continually  advancing  into the
work.	"

"	As the contiguity routine  scans, it lays down back  pointers
in the scratch array which enable it  to retrace its path back to the
start.  If  a dead  end  is reached  (no more  marked  neighbors), it
traces back along this path, looking for marked points  to the right.
There can  be no marked points  on the left  side while backtracking,
since this was the right side on  the way out, and the outgoing  scan
stayed as far  to the right as possible.  If a marked point  is found
on the backtrace,  it is replaced with a pointer to the adjacent path
already traced out, and then a new  path is traced as if this were  a
new starting point. When the  backtrace reaches the original starting
point,  the   contiguity  scan  is  completed.  The  effect  of  this
algorithm is to  construct a tree of  pointers in the scratch  array,
with the starting point at  the root. All points which can be reached
via a connected path from the starting  point will be a part of  this
tree.	"
~F84. SMOOTHING.


	In CRE the term "smoothing" refers more to the problem of
breaking a manifold (polygon) into functions (arcs) than to
the problem of fitting functions to measured data. The smoothing
step converts the polygons of vertical and horizontal vectors into
polygons of arcs. For the present the term "arc" means "linear arc",
which is a line segment. Fancier arcs, circular and cubic spline,
were implemented and thrown out mostly because they were of no use
to higher processes such as the polygon compare, which would break
the fancier arcs back down into linear vectors for computing areas,
inertia tensors or mere display buffers.


	Smoothing is applied to each polygon of a level. To start
the smoothing, a ring of two arcs is formed (a bi-gon) with one arc
at the uppermost left and the other at the lowermost right of the
given vector polygon. Next a recursive make arc operation, MKARC,
is applied to the two initial arcs. Since the arc given to MKARC is
in a one-to-one correspondence with a doubly linked list of vectors,
MKARC checks to see whether each point on the list of vectors is
close enough to the approximating arc. MKARC returns the given arc
as good enough when all the sub vectors fall within a given width;
otherwise MKARC splits the arc in two and places a new arc vertex on
the vector vertex that was furthest away from the original arc.
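MKARC's split rule is the scheme now commonly known as Ramer-Douglas-
Peucker line simplification. A sketch over an open chain of points
(the ring-of-vectors and bi-gon bookkeeping is omitted):

```python
def mkarc(points, lo, hi, width, out):
    """Recursive arc maker in the spirit of MKARC: if every point
    strictly between the arc ends lo..hi lies within `width` of the
    chord, the arc is good enough; otherwise split at the point
    furthest from the chord."""
    (x0, y0), (x1, y1) = points[lo], points[hi]
    dx, dy = x1 - x0, y1 - y0
    norm = (dx * dx + dy * dy) ** 0.5 or 1.0
    far, fdist = lo, 0.0
    for i in range(lo + 1, hi):
        x, y = points[i]
        dist = abs(dy * (x - x0) - dx * (y - y0)) / norm  # point-to-chord distance
        if dist > fdist:
            far, fdist = i, dist
    if fdist <= width:
        out.append(points[hi])            # accept the arc
    else:
        mkarc(points, lo, far, width, out)
        mkarc(points, far, hi, width, out)

def smooth(points, width):
    """Simplify an open polyline to arcs within the given width."""
    out = [points[0]]
    mkarc(points, 0, len(points) - 1, width, out)
    return out
```

With a generous width a whole sawtooth collapses to a single segment;
with width zero every vertex survives.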


	The two large images in figure-7 illustrate a polygon
smoothed with arc width tolerances set at two different widths in
order to show one recursion of MKARC. The eight smaller images
illustrate the results of setting the arc width tolerance over a
range of values. Because of the dekinking mentioned earlier, the arc
width tolerance can be equal to or less than 1.0 pixels and still
yield a substantial reduction in the number of vectors it takes to
describe a contour polygon.


	A final important smoothing detail is that the arc width
tolerance is actually taken as a function of the highest contrast
vector found along the arc, so that high contrast arcs are smoothed
with much smaller arc width tolerances than are low contrast arcs.
After smoothing, the contrast across each arc is computed and the
ring of arcs replaces the ring of vectors of the given polygon.
(Polygons that would be expressed as only two arcs are deleted).
~I1973,800;F8- 30 -
~F85. COMPARING.

	The compare step of CRE, CMPARE, connects the polygons and
arcs of the current image with corresponding polygons and arcs of
the previous image. CMPARE solves the problem of correlating
features between two similar images and is composed of four sub
sections: 

	1. make shape nodes for polygons.
	2. compare and connect polygons one to one.
	3. compare and connect polygons two to one.
	4. compare and connect vertices of connected polygons.

	First, the shape nodes of all the polygons of an image are
computed. The shape node contains the center of mass and the lamina
inertia tensor of a polygon. The lamina inertia tensor of a polygon
with N sides is computed by summation over N trapezoids. The
trapezoid corresponding to each side is formed by dropping
perpendiculars "up" to the top of the image frame; each such
trapezoid consists of a rectangle and a right triangle. Since the
sides of polygons are directed vectors, the areas of the triangles
and rectangles can be arranged to take positive and negative values
such that the summation will describe the interior region of the
polygon as positive. The equations necessary for computing the
lamina inertia tensor of a polygon are collected in a table in the
postscripts to this paper and were derived by using Goldstein's
Classical Mechanics [1] as a reference. The meaning of the inertia
tensor is that it characterizes each polygon by a rectangle of a
certain length and width at a particular location and orientation;
of further importance, such inertia tensors can be "added" to
characterize two or more polygons by a single rectangle. It is the
lamina inertia tensor rectangles that are actually compared by CRE.
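The edge-by-edge summation can be sketched as follows. Rather than
transcribing the rectangle-plus-triangle bookkeeping, this illustrative
version integrates each edge's strip down to the line y = 0 with
two-point Gauss quadrature (exact for the cubic integrands involved);
the orientation is assumed to make the enclosed area come out positive,
and the result agrees with the rectangle and right-triangle entries of
the table in the postscripts:

```python
def lamina_tensor(poly):
    """Area, center of mass, and central moments (MXX, MYY, PXY) of a
    polygon, summed edge by edge over the signed trapezoids between
    each directed edge and the line y = 0 (the top of the frame).
    PXY follows the inertia-tensor sign convention -integral(x*y)dA."""
    G = (0.5 - 0.5 / 3 ** 0.5, 0.5 + 0.5 / 3 ** 0.5)  # Gauss nodes on [0,1]
    A = Sx = Sy = Ixx = Iyy = Pxy = 0.0
    n = len(poly)
    for i in range(n):
        (x0, y0), (x1, y1) = poly[i], poly[(i + 1) % n]
        dx = x1 - x0
        for t in G:
            x = x0 + t * dx
            y = y0 + t * (y1 - y0)
            w = 0.5 * dx              # quadrature weight times edge run
            A   += w * y              # strip area
            Sx  += w * x * y          # first moment in x
            Sy  += w * y * y / 2      # first moment in y
            Ixx += w * y ** 3 / 3     # second moment about the x-axis
            Iyy += w * x * x * y      # second moment about the y-axis
            Pxy += w * x * y * y / 2  # integral of x*y over the lamina
    xcm, ycm = Sx / A, Sy / A
    # translate the moments to the center of mass (cf. the table)
    mxx = Ixx - A * ycm * ycm
    myy = Iyy - A * xcm * xcm
    pxy = -(Pxy - A * xcm * ycm)
    return A, xcm, ycm, mxx, myy, pxy
```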

	Second, all the shapes of the polygons of one level of the
first image are compared with all the shapes of the polygons of the
corresponding level of the second image for a nearly exact match. The
potentially (M*N/2) compares are avoided by sorting on the center of
mass locations. In CRE, which is intended for comparing sequences
of pictures of natural scenes, match for center of mass location is
tested first and most strictly, followed by match for inertia.
Pointers between matching polygons are placed in the time link
positions of the polygon nodes and the polygons are considered to be
mated in time. ~I1973,800;F8- 32 -
~F85. COMPARING.

	Third, all the unmated polygons of a level are considered
two at a time and a fusion shape node for each pair is made. The
potentially (N*N/2-N) fusion shapes are avoided because there is a
maximum possible unmated inertia in the other image; so, if there
are no unmated polygons in one image then the extra polygons of the
first image can be ignored. In the event that there are unmated
polygons in corresponding levels of the two images, the fusion
shapes of one are compared with the polygon shapes of the other.
The fusion (fission) compare solves the rather nasty problem,
illustrated in figures 9A and 9B, of linking two contour polygons of
one image with a single contour polygon in the next image.

	Fourth, the vertices of polygons mated in time are compared
and mated. To start a vertex compare, the vertices of one polygon
are translated, rotated and dilated to get that polygon's lamina
inertia tensor coincident with its mate (or mates). Conceptually,
each vertex of one polygon is compared with each vertex of the other
polygon(s) and the mutually closest vertices (closer than an
epsilon) are considered to be mated. Actually the potential (N*M)
compares are avoided by a window splitting scheme similar to that
used in hidden line elimination algorithms (like Warnock's).

	The results  of vertex compare  and mate  are illustrated  in
figures 9A and 9D; the compare  execution takes less than a second on
images  such as the  pump,  blocks,  and dolls that  have appeared in
this  paper. The  applications  of  this compare  might  include  the
aiming   of  a  pixel   correlation  comparator  (such   as  Quam's);
recognition and location of an  expected object; or the location  and
extent of  an unknown  object.   It is  this latter application  that
will be described in my forthcoming thesis.
~I1973,800;F8- 33 -
~F8SUMMARY OF LAMINA INERTIA TENSOR EXPRESSIONS.

RECTANGLE'S  LAMINA INERTIA TENSOR ABOUT ITS CENTER OF MASS.

	MXX	←	B*B*AREA/12;		(B HEIGHT IN ROWS).
	MYY 	←	A*A*AREA/12;		(A WIDTH IN COLUMNS).
	MZZ	←	MXX + MYY;
	PXY 	←	0;

ORIENTED RIGHT TRIANGLE'S LAMINA INERTIA TENSOR ABOUT ITS CENTER OF MASS.

	MXX	←	B*B*AREA/18;		(B HEIGHT IN ROWS).
	MYY	←	A*A*AREA/18;		(A WIDTH IN COLUMNS).
	MZZ	←	MXX + MYY;
	PXY	←	-A*B*AREA/36;

SUMMATION OF LAMINA INERTIA TENSORS.

	AREA	←	(AREA1  +  AREA2);
	XCM	←	(AREA1 * XCM1  +  AREA2 * XCM2) / AREA;
	YCM	←	(AREA1 * YCM1  +  AREA2 * YCM2) / AREA;
	MXX	←	MXX1 + YCM1*YCM1*AREA1  +
			MXX2 + YCM2*YCM2*AREA2  -  YCM*YCM*AREA;
	MYY	←	MYY1 + XCM1*XCM1*AREA1  +
			MYY2 + XCM2*XCM2*AREA2  -  XCM*XCM*AREA;
	PXY	←	PXY1 - XCM1*YCM1*AREA1  +
			PXY2 - XCM2*YCM2*AREA2  +  XCM*YCM*AREA;

ANGLE OF PRINCIPAL AXIS.

	PHI	←	0.5*ATAN((MYY-MXX)/(2*PXY))
	PXY	←	0.5*(MYY - MXX)*TAN(2*PHI)

TRANSLATION OF LAMINA INERTIA TENSOR AWAY FROM CENTER OF MASS.
	
	MXX'	←	MXX  +  AREA*DY*DY;
	MYY'	←	MYY  +  AREA*DX*DX;
	PXY'	←	PXY  -  AREA*DX*DY;

ROTATION OF LAMINA INERTIA TENSOR ABOUT CENTER OF MASS.

	C	←	COSINE(PHI);
	S	←	SINE(PHI);
	MXX'	←	C*C*MXX  +  S*S*MYY  -  2*C*S*PXY;
	MYY'	←	C*C*MYY  +  S*S*MXX  +  2*C*S*PXY;
	PXY'	←	(C*C - S*S)*PXY + C*S*(MYY - MXX);
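The summation expressions above transcribe directly into code; the
tuple layout (AREA, XCM, YCM, MXX, MYY, PXY) is an assumption of this
sketch:

```python
def combine(s1, s2):
    """Transcription of the summation table: merge two lamina inertia
    tensors, each given as (AREA, XCM, YCM, MXX, MYY, PXY) about its
    own center of mass, into one tensor about the combined center."""
    a1, x1, y1, mxx1, myy1, pxy1 = s1
    a2, x2, y2, mxx2, myy2, pxy2 = s2
    area = a1 + a2
    xcm = (a1 * x1 + a2 * x2) / area
    ycm = (a1 * y1 + a2 * y2) / area
    mxx = mxx1 + y1 * y1 * a1 + mxx2 + y2 * y2 * a2 - ycm * ycm * area
    myy = myy1 + x1 * x1 * a1 + myy2 + x2 * x2 * a2 - xcm * xcm * area
    pxy = pxy1 - x1 * y1 * a1 + pxy2 - x2 * y2 * a2 + xcm * ycm * area
    return area, xcm, ycm, mxx, myy, pxy

# Merging two unit squares side by side reproduces the rectangle
# entries for a 2-wide, 1-tall rectangle: MXX = B*B*AREA/12 and
# MYY = A*A*AREA/12.
```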


~I1973,800;F8- 51 -